
Percolate query optimization: Fetch fields mentioned in queries instead of entire doc and batch percolate query by heap-based threshold #1331

Draft
wants to merge 17 commits into base: main

Conversation

eirsep
Member

@eirsep eirsep commented Dec 20, 2023

Issue #, if available:
#1353
#1367

Description of changes:

This PR improves the scalability of the percolate queries performed by document-level monitors.

Status quo behaviour

The existing behaviour is to query and collect data from all shards of each index and then perform a percolate query on the collected docs, once per index. Hence the minimum number of percolate queries performed in each execution equals the number of concrete indices being queried.
This does not scale: the data node executing the doc-level monitor can exceed its heap limits when the docs held in memory from all shards of an index grow too large.

New behaviour introduced in the PR

We introduce a setting, plugins.alerting.monitor.percolate_query_docs_size_memory_percentage_limit, that caps how much doc source data may accumulate in memory before a percolate query must be performed.
With this batching, the minimum number of percolate queries per execution is 1 irrespective of the number of concrete indices being queried, and the total number of percolate queries in a given execution becomes more deterministic: it is a function of the data node's heap size rather than one query per concrete index.
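The batching described above can be sketched as follows. This is a minimal, self-contained illustration with hypothetical names (PercolateBatcher, addDoc, flush are not the PR's actual identifiers): docs accumulate across shards and indices, and a percolate query is triggered whenever the pending source bytes cross a percentage of the node's max heap, plus one final flush at the end.

```java
import java.util.ArrayList;
import java.util.List;

public class PercolateBatcher {
    private final long heapMaxBytes;
    private final int thresholdPercentage;      // stand-in for the new cluster setting
    private final List<String> pendingDocs = new ArrayList<>();
    private long pendingBytes = 0;
    private int percolateQueryCount = 0;        // how many batched queries ran

    public PercolateBatcher(long heapMaxBytes, int thresholdPercentage) {
        this.heapMaxBytes = heapMaxBytes;
        this.thresholdPercentage = thresholdPercentage;
    }

    public void addDoc(String docSource) {
        pendingDocs.add(docSource);
        pendingBytes += docSource.getBytes().length;
        // flush when pendingBytes >= thresholdPercentage% of the max heap
        if (pendingBytes * 100 >= heapMaxBytes * thresholdPercentage) {
            flush();
        }
    }

    // Called once after all indices are covered, so the minimum number of
    // percolate queries per execution is 1.
    public void finish() {
        if (!pendingDocs.isEmpty()) flush();
    }

    private void flush() {
        percolateQueryCount++;   // stand-in for actually running the percolate query
        pendingDocs.clear();
        pendingBytes = 0;
    }

    public int queryCount() { return percolateQueryCount; }
}
```

The query count depends only on total bytes relative to the heap-based threshold, not on how many indices contributed docs.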

CheckList:

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

matchingDocIdsPerIndex?.get(concreteIndexName),
monitorMetadata,
inputRunResults,
docsToQueries
Collaborator

We keep collecting doc ids to queries here and run the finding creation / trigger execution workflow once: https://github.com/opensearch-project/alerting/pull/1331/files#diff-68866b22ed9703814b4d5db8d3488872bcb972086ecaca10c9b8bfd54db981bcR237.
Do we also execute the finding creation / trigger execution workflow at each shard level?

Member Author

The trigger execution workflow needs all results.
For findings, we can brainstorm whether creation needs to happen at the shard level.
Currently, though, findings are indexed one by one. A bigger and more important optimization would be to ingest them in batches using bulk requests: #1333

@eirsep
Member Author

eirsep commented Dec 22, 2023

Will explore how I can factor in the free memory available at that point in time. We can decide on an acceptable fraction of the heap size, and if the accumulated data crosses that threshold, submit the percolate query with that many docs.
This way we can also cover multiple indices in a single query as long as the batch stays below the threshold, and decide when to execute another percolate query more deterministically.

@eirsep eirsep marked this pull request as draft December 22, 2023 23:48
@eirsep eirsep changed the title Percolate query optimization: Perform percolate query once for each shard's documents Percolate query optimization: Perform percolate query when docs in memory breach threshold Jan 4, 2024
@eirsep eirsep changed the title Percolate query optimization: Perform percolate query when docs in memory breach threshold Percolate query optimization: Perform percolate query when docs in memory breach threshold setting for batching Jan 4, 2024
@eirsep eirsep force-pushed the percolate_per_shard branch from f150174 to cb92f54 Compare January 4, 2024 03:05

@asuresh8 asuresh8 left a comment


Why no unit tests?

@@ -219,41 +233,39 @@ object DocumentLevelMonitorRunner : MonitorRunner() {
// Prepare DocumentExecutionContext for each index
val docExecutionContext = DocumentExecutionContext(queries, indexLastRunContext, indexUpdatedRunContext)

val matchingDocs = getMatchingDocs(
fetchShardDataAndMaybeExecutePercolateQueries(

This method has a lot of parameters. Consider using an object as the parameter. That way a builder could be used so that the order of arguments does not matter. Would also make this more readable.
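The reviewer's suggestion could look something like the sketch below. The class and field names (ShardFetchParams, concreteIndexName, shardId, maxSeqNo) are hypothetical stand-ins for the method's actual parameters; the point is that a builder makes argument order irrelevant and call sites self-describing.

```java
// Hypothetical parameter object replacing a long positional argument list.
public class ShardFetchParams {
    final String concreteIndexName;
    final int shardId;
    final long maxSeqNo;

    private ShardFetchParams(Builder b) {
        this.concreteIndexName = b.concreteIndexName;
        this.shardId = b.shardId;
        this.maxSeqNo = b.maxSeqNo;
    }

    public static class Builder {
        private String concreteIndexName;
        private int shardId;
        private long maxSeqNo;

        public Builder concreteIndexName(String v) { concreteIndexName = v; return this; }
        public Builder shardId(int v) { shardId = v; return this; }
        public Builder maxSeqNo(long v) { maxSeqNo = v; return this; }
        public ShardFetchParams build() { return new ShardFetchParams(this); }
    }
}
```

A call site then reads as named arguments: new ShardFetchParams.Builder().concreteIndexName("logs-0").shardId(2).build().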

}
}
/* if all indices are covered but the in-memory docs size limit is still not breached,
   we would need to submit the percolate query at the end */

Nit: indentation looks off.

*
*/
private fun isInMemoryDocsSizeExceedingMemoryLimit(docsBytesSize: Long, monitorCtx: MonitorRunnerExecutionContext): Boolean {
var thresholdPercentage = PERCOLATE_QUERY_DOCS_SIZE_MEMORY_PERCENTAGE_LIMIT.get(monitorCtx.settings)

It seems inefficient to compute the threshold every time this method is called. Is it possible to compute the threshold only when the monitor is initialized?

Member Author

The threshold is a dynamic setting, so we need to fetch the latest value every time we evaluate. OpenSearch does some settings caching in the background, so we are not really recomputing the threshold each time.

if (thresholdPercentage > 100 || thresholdPercentage < 0) {
thresholdPercentage = PERCOLATE_QUERY_DOCS_SIZE_MEMORY_PERCENTAGE_LIMIT.getDefault(monitorCtx.settings)
}
val heapMaxBytes = monitorCtx.jvmStats!!.mem.heapMax.bytes

The double-bang operator (!!) is not null-safe. Consider replacing it with an appropriate null check.

*/
private fun isInMemoryDocsSizeExceedingMemoryLimit(docsBytesSize: Long, monitorCtx: MonitorRunnerExecutionContext): Boolean {
var thresholdPercentage = PERCOLATE_QUERY_DOCS_SIZE_MEMORY_PERCENTAGE_LIMIT.get(monitorCtx.settings)
if (thresholdPercentage > 100 || thresholdPercentage < 0) {
Collaborator

Can this be validated when the setting is set rather than here?
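The reviewer's idea, validating at update time instead of clamping on every read, can be sketched with a self-contained stand-in. OpenSearch's Setting.intSetting has overloads that take min/max bounds and reject out-of-range values when the setting is applied; the hypothetical class below only mimics that behavior for illustration.

```java
// Stand-in for a bounded dynamic setting: invalid values are rejected on
// set(), so reads never need to re-validate or fall back to the default.
public class BoundedIntSetting {
    private final int min;
    private final int max;
    private int value;

    public BoundedIntSetting(int defaultValue, int min, int max) {
        this.min = min;
        this.max = max;
        this.value = defaultValue;
    }

    public void set(int newValue) {
        if (newValue < min || newValue > max) {
            throw new IllegalArgumentException(
                "value must be between " + min + " and " + max + ", was " + newValue);
        }
        value = newValue;
    }

    public int get() { return value; }
}
```

With bounds enforced at set time, the `thresholdPercentage > 100 || thresholdPercentage < 0` branch in the runner would become unnecessary.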

* level monitor execution. The docs are being collected from searching on shards of indices mentioned in the
* monitor input indices field.
*/
val PERCOLATE_QUERY_DOCS_SIZE_MEMORY_PERCENTAGE_LIMIT = Setting.intSetting(
Collaborator

Minor: MEMORY -> HEAP

}
} finally { // no catch block because exception is caught and handled in runMonitor() class
Collaborator

What's the need to wrap in try/finally if there's nothing being caught?

Comment on lines +712 to +745
inputRunResults.getOrPut(id) { mutableSetOf() }.add(docIndex)
docsToQueries.getOrPut(docIndex) { mutableListOf() }.add(id)
Collaborator

Minor: passing a collection to another function so it can be modified is error-prone. What do you think about returning these collections from this method as fields in an object, then populating the original inputRunResults and docsToQueries collections from the return value?
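A minimal sketch of that suggestion, with hypothetical names (PercolateResult, addMatch, mergeInto are illustrative, not the PR's code): the helper returns its query-to-doc matches in a small result object, and the caller merges them into the accumulated run results.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Result object returned by the percolate helper instead of mutating
// caller-owned collections in place.
public class PercolateResult {
    final Map<String, Set<String>> queryIdToDocIds = new HashMap<>();

    void addMatch(String queryId, String docId) {
        queryIdToDocIds.computeIfAbsent(queryId, k -> new HashSet<>()).add(docId);
    }

    // Caller-side merge into the accumulated inputRunResults-style map.
    static void mergeInto(Map<String, Set<String>> accumulated, PercolateResult r) {
        r.queryIdToDocIds.forEach((queryId, docIds) ->
            accumulated.computeIfAbsent(queryId, k -> new HashSet<>()).addAll(docIds));
    }
}
```

Ownership of the mutable state stays with the caller, so the helper has no side effects on its arguments.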

Comment on lines +717 to +750
transformedDocs.clear()
docsSizeInBytes.set(0)
Collaborator

Similar comment about passing collections to another function and then mutating them.

@eirsep
Member Author

eirsep commented Jan 10, 2024

@sbcd90
Should we also add another condition to batch the percolate query based on the number of docs, capping it at 100k docs per search at most, along with the memory limit?

That way both conditions are evaluated to decide whether to perform a percolate query on the current doc set.
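The proposed dual condition can be expressed as a tiny flush policy. This is an illustrative sketch (BatchPolicy and shouldFlush are hypothetical names): the pending batch is flushed when either the doc-count cap or the heap-derived byte threshold is breached, whichever comes first.

```java
// Flush policy combining a doc-count cap (e.g. 100k docs per percolate
// search) with the in-memory size threshold.
public class BatchPolicy {
    private final int maxDocs;
    private final long maxBytes;

    public BatchPolicy(int maxDocs, long maxBytes) {
        this.maxDocs = maxDocs;
        this.maxBytes = maxBytes;
    }

    public boolean shouldFlush(int pendingDocCount, long pendingBytes) {
        return pendingDocCount >= maxDocs || pendingBytes >= maxBytes;
    }
}
```

Evaluating both limits bounds each percolate search in two dimensions: request size in bytes and number of documents.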

@eirsep eirsep force-pushed the percolate_per_shard branch from e09669f to 27a6ec4 Compare January 11, 2024 07:36
… each shard instead of performing one percolate query on docs from all shards

Signed-off-by: Surya Sashank Nistala <[email protected]>
Signed-off-by: Surya Sashank Nistala <[email protected]>
Signed-off-by: Surya Sashank Nistala <[email protected]>
… percolate query

Signed-off-by: Surya Sashank Nistala <[email protected]>
…end of collecting data from all indices

Signed-off-by: Surya Sashank Nistala <[email protected]>
…ex is a text field in query index mapping

Signed-off-by: Surya Sashank Nistala <[email protected]>
… query should be performed immediately

Signed-off-by: Surya Sashank Nistala <[email protected]>
Signed-off-by: Surya Sashank Nistala <[email protected]>
…monitor to submit to percolate query instead of docs_source

Signed-off-by: Surya Sashank Nistala <[email protected]>
@eirsep eirsep force-pushed the percolate_per_shard branch from b10e56e to 009cd61 Compare January 25, 2024 01:23
@eirsep eirsep changed the title Percolate query optimization: Perform percolate query when docs in memory breach threshold setting for batching Percolate query optimization: Fetch fields mentioned in queries instead of entire doc and batch percolate query by heap-based threshold Jan 31, 2024
Signed-off-by: Surya Sashank Nistala <[email protected]>
@eirsep eirsep force-pushed the percolate_per_shard branch from b439559 to da6afbc Compare February 8, 2024 23:53
Signed-off-by: Surya Sashank Nistala <[email protected]>
…as is configured

Signed-off-by: Surya Sashank Nistala <[email protected]>
@eirsep
Member Author

eirsep commented Feb 19, 2024

Closing in favor of smaller PRs

6 participants